ABSTRACT


Interactive  programming  environments  for  languages  offer  many  advantages 
over  traditional  batch-oriented  ones,  such  as  immediate  static  analysis.  One  form 
of  analysis  is  type  checking,  yet  type  checking  in  this  setting  for  languages  with 
common  features  like  overloading  has  received  little  attention. 

We  implement  an  interactive  type  checker  for  the  polymorphic  type  system 
of  ML  with  overloading.  The  implementation  was  produced  automatically  from  an 
attribute  grammar  using  the  Synthesizer  Generator,  an  attribute  evaluator  generator. 
Type  inference  then  is  accomplished  via  attribute  evaluation  so  that  if  the  evaluation 
is  done  incrementally,  then  type  inference  becomes  incremental  as  well. 
I.  INTRODUCTION


In  this  thesis,  we  assume  the  reader  is  familiar  with  basic  type  theory  and  its 
associated  notational  conventions.  We  also  assume  a  general  familiarity  with  the 
concepts  and  notation  of  the  lambda-calculus.  A  comprehensive  presentation  of 
these  concepts  can  be  found  in  the  texts  of  Thompson  [Tho91]  and  Gunter  [Gun92]. 

The advantages of interactive programming environments for increasing programmer
effectiveness and maximizing utilization of system resources are significant. For
example, during program development, extensive context-sensitive type checking is a
valuable tool. The immediate recognition of type errors at this stage could yield vast
improvements to the quality and reliability of today's software products. Valuable
system resources would be preserved through decreased waste due to unnecessary
re-compilations. Perhaps more significantly, the advantages of providing an
environment where programmers can focus on the fundamental aspects of a problem with
a much higher degree of continuity are clear.

The study of type inference is integral to this effort. Though significant
advances have been made in this research area, further work needs to be done. This
thesis considers a suitable type system for implementing a polymorphic programming
language with overloading. Utilizing this type system, an implementation is
produced that performs incremental type inference in an interactive environment.

One can argue that System ML represents the current state of the art in type
systems. It is a polymorphic type system but prohibits the use of overloading. Yet
the need for overloading in programming languages is well known. Current imperative
languages, such as Ada and C++, and even the functional language Standard ML,
allow an identifier to represent different types, but the resulting programs merely
contain monomorphic instances overloaded on an identifier's name. A process called
overloading resolution is required to assign a particular type to an identifier based
on its context. Consider the following expressions, where + is defined over integers
and reals:

(a)  1 + 2
(b)  1.0 + 2.0

What is the type of +? We only know that it can have the type
$\mathit{int} \to \mathit{int} \to \mathit{int}$ or the type
$\mathit{real} \to \mathit{real} \to \mathit{real}$. But we can reliably assign
neither of these types to + without first examining its context. In a polymorphic
language, like ML, we can assign + the type
$\forall\alpha.\ \alpha \to \alpha \to \alpha$, but this results in + having too many
types. On the other hand, if we assign + the type
$\mathit{real} \to \mathit{real} \to \mathit{real}$ we preclude its use in
expression (a). We will examine these issues in more detail in Chapter II.

What is needed is a means to express a type for + which encompasses all of
its possible types and no more. We can do this with the use of constrained type
schemes. We can then assign to any occurrence of +, regardless of its context, the
type $\forall\alpha\ \mathbf{with}\ + : \alpha \to \alpha \to \alpha\ .\ \alpha \to \alpha \to \alpha$.
This means that + can assume any finite type $\alpha \to \alpha \to \alpha$, with
$\alpha$ instantiated to any particular type for which + is defined.
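
As a minimal sketch of how such a constrained type scheme behaves, the Python
fragment below instantiates the scheme for + only at types where + is assumed to be
defined. The names PLUS_INSTANCES, arrow and instantiate_plus are hypothetical and
chosen for illustration; they do not come from the thesis, and the set of instances
(int and real) is the assumption made in the example above.

```python
# Hypothetical model: the types at which + is assumed to be defined.
PLUS_INSTANCES = {"int", "real"}


def arrow(*types):
    """Render a curried function type such as int -> int -> int."""
    return " -> ".join(types)


def instantiate_plus(alpha):
    """Instantiate 'forall a with + : a -> a -> a . a -> a -> a' at alpha.

    The instantiation is allowed only when the constraint
    + : alpha -> alpha -> alpha is satisfied by the assumed instances.
    """
    if alpha not in PLUS_INSTANCES:
        raise TypeError(f"+ is not defined at {arrow(alpha, alpha, alpha)}")
    return arrow(alpha, alpha, alpha)


print(instantiate_plus("int"))   # int -> int -> int
print(instantiate_plus("real"))  # real -> real -> real
```

Asking for an instance at bool raises a TypeError in this sketch, which mirrors the
idea that the scheme covers all of the types of + and no more.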

Using the concept of constrained type schemes, an extension to System ML
incorporating overloading, called MLo, has been developed. The associated type
inference algorithm Wo infers principal types for expressions in MLo. It turns out
that, unless we place restrictions on the kinds of overloadings we can express using
constrained type schemes, typability in MLo is undecidable. In Chapter IV we consider
a form of overloading called parametric overloading, which makes typability in MLo
decidable, and present an algorithm which determines satisfiability of constraints
with respect to a parametric assumption set.


A.  IMPLEMENTING  Wo 

Wo  performs  batch  type  inference.  In  this  respect,  it  is  unsuitable  for  direct 
incorporation  into  a  useful  interactive  programming  environment.  What  is  needed 
is  an  incremental  approach  to  type  inference  which  will  provide  immediate  feedback 
to  the  programmer  when  type  errors  are  encountered. 

One might attempt to rewrite Wo to achieve incremental type inference. Our
approach is instead to utilize the formalism of attribute grammars to express Wo. In
this setting type inference is performed via attribute evaluation. As expressions are
input, a corresponding change is reflected in the attribution. If we are able to
perform attribute evaluation incrementally, type inference can also be done
incrementally. Furthermore, the incrementality is implicit in the formalism.
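
The idea that incremental attribute evaluation yields incremental type inference can
be illustrated with a small Python sketch. It is not the algorithm used by the
Synthesizer Generator; it only assumes a tree whose nodes cache one synthesized
attribute (a type), and an edit operation that invalidates the caches on the path
from the edited node to the root. The names Node, replace_label, type_of and
evaluations are hypothetical.

```python
class Node:
    """Expression tree node with one cached synthesized attribute (its type)."""

    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)
        self.parent = None
        self.cached = None
        for child in self.children:
            child.parent = self

    def replace_label(self, label):
        """Edit this node and invalidate caches on the path to the root."""
        self.label = label
        node = self
        while node is not None:
            node.cached = None
            node = node.parent


evaluations = 0  # counts how many attribute instances are (re)computed


def type_of(node):
    """Literals are int or real; + requires both operands to have the same type."""
    global evaluations
    if node.cached is not None:
        return node.cached
    evaluations += 1
    if node.label == "+":
        left, right = (type_of(child) for child in node.children)
        node.cached = left if left == right else "error"
    else:
        node.cached = "real" if "." in node.label else "int"
    return node.cached


leaf4 = Node("4")
root = Node("+", [Node("+", [Node("1"), Node("2")]),
                  Node("+", [Node("3"), leaf4])])

print(type_of(root), evaluations)  # int 7    (first pass visits all 7 nodes)
leaf4.replace_label("4.0")
print(type_of(root), evaluations)  # error 10 (only 3 nodes are re-evaluated)
```

After the edit, only the changed leaf and its two ancestors are re-evaluated, and the
type error is reported without revisiting the untouched subtree.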

We present an implementation of Wo utilizing an attribute grammar in SSL, the
language of the Synthesizer Generator of GrammaTech. The Synthesizer Generator is an
attribute evaluator generator that takes as input a set of attribute equations and
returns as output an attribute evaluator or, in our case, a type checker. By
utilizing the Synthesizer Generator for our implementation we are not only able to
produce an attribute evaluator, but one in which attribute evaluation is done
incrementally. As a result, we are able to achieve both attribute evaluation and type
inference in an incremental setting. Chapters IV and V discuss details of the
implementation and the algorithms used.




II.  TYPE  SYSTEMS 


The  concept  of  type  systems  in  programming  languages  deals  with  a  set  of 
rules  which,  when  applied  to  terms  of  a  language,  produce  types  for  those  terms. 
The  notion  of  types  in  programming  languages  has  been  given  steadily  increasing 
importance  over  the  past  several  years.  It  is  clear  that  languages  with  rich  type 
classes  offer  programmers  more  flexibility  in  modeling  real-world  objects.  Yet,  there 
remains  a  significant  lack  of  consensus  as  to  what  types  are.  As  consensus  in  this  area 
is  critical  to  the  successful  application  of  type  theory  to  practical  implementations 
of  new  programming  environments,  this  chapter  outlines  the  most  important  aspects 
of  type  systems  and  their  application  to  this  thesis. 

A.  WHAT  IS  A  TYPE? 

When discussing types, there exists a tendency to blur the distinction between
implementation issues and the underlying nature of types in general. Actual
machines, for example, provide relatively few types (integers, floating-point
numbers, pointers, and so on). The implementation of types in a high-level language,
while posing some very real problems in the area of compiler design, should remain
distinct from a discussion of type correctness in the higher context of the meaning
of types. With reference to implementation issues, referred to as reductionist type
correctness, Smith states:

The  key  issue  is  how  to  protect  the  representation  from  misuse.  [Smi91] 

In  this  thesis,  we  will  not  concern  ourselves  with  the  reductionist  view  of  types. 
Rather,  we  will  view  a  type  as  an  algebra,  a  set  of  values  and  operations  such  that 
the set is closed under these operations. For example, type int is the set of integers
together  with  the  usual  arithmetic  operations,  but  the  set  of  natural  numbers  and 
the  predecessor  operation  do  not  form  a  type.  This  view  gives  us  a  fundamental  basis 
from  which  to  discuss  the  meaning  and  usage  of  types  in  programming  languages 
unencumbered  by  implementation  issues.  Operations  of  an  algebra  are  axiomatized, 
providing  then  a  semantics  that  one  can  use  to  reason  about  programs  in  which  they 
occur.  In  order  to  use  the  axioms,  however,  it  may  be  necessary  to  restrict  the  types 
of  certain  program  arguments  to  the  algebras  in  question.  For  example,  if  we  are  to 
prove  that  a  function  adds  1  to  its  argument  then  we  might  wish  to  fix  the  type  of 
its  argument  to  int,  say.  For  some  programs,  though,  reasoning  can  proceed  without 
fixing  argument  types.  Such  programs  are  called  polymorphic. 

B.  POLYMORPHISM 

Polymorphic means to have many forms. With respect to programming languages,
this refers to programs or terms which have many types, or can operate on
values  of  many  types.  Perhaps  more  intuitively,  we  can  state  that  the  purpose  of 
polymorphism  is  to  allow  programs  which  use  a  single  name  to  operate  on  many 
different  types  of  inputs  and,  perhaps,  produce  different  types  of  output. 

We will first be concerned with a form of polymorphism called parametric
polymorphism, where polymorphic entities can be described by a universally quantified
formula with all quantification at the outermost level (e.g.
$\forall\alpha.\ \alpha \to \alpha$). In Figure 2.1, we give an example of a
function, length, defined in a generic polymorphic programming language. We can
ascribe to length the type $\forall\alpha.\ \mathit{list}(\alpha) \to \mathit{int}$.
Its meaning is a function which, given a list, computes its length.

function length(x)
{
    if not null(x) then
        1 + length(tail(x))
    else
        0
}

Figure 2.1: Polymorphic length function

Languages which do not support polymorphism put unnecessary restrictions on
the use of a function. Consider the Pascal program in Figure 2.2. Function min
has the type $\mathit{int} \to \mathit{int} \to \mathit{int}$. Yet there is nothing
inherent in min which depends on integer. Replacing integer with char would yield a
correct Pascal program with meaning corresponding to the lexicographic ordering of
characters.

It is not uncommon for the claim to be made that Ada is a polymorphic
programming language, as in [ASU86]. One might argue that it is, but really only
weakly so. Through the use of generics, one can define a template for representing
what appears to be a polymorphic function. In the example of Figure 2.3, one might
wish to ascribe the type $\forall\alpha.\ \alpha \to \alpha \to \alpha$ to the Ada
function min within the generic package MIN_PKG. This would indicate that min is
defined over all instantiations of $\alpha$, including int and char. This is not the
case, for a generic package cannot be used directly in Ada. It must first be
instantiated for a particular type so that it can be properly type checked. Though
the language provides constructs for expressing polymorphism, the resulting compiled
program merely contains monomorphic instances of the function overloaded on the
identifier min. Research into providing polymorphism in an imperative language is
ongoing [Car87].

function min(x, y : integer) : integer;
begin
    if x < y then
        min := x
    else
        min := y
end;

Figure 2.2: Pascal min function
generic
    type ITEM is private;
    -- "<" defaults to the visible ordering on ITEM, if one exists
    with function "<" (left, right : ITEM) return BOOLEAN is <>;
package MIN_PKG is
    function min (x, y : ITEM) return ITEM;
end MIN_PKG;

package body MIN_PKG is
    function min (x, y : ITEM) return ITEM is
    begin
        if x < y then
            return x;
        else
            return y;
        end if;
    end min;
end MIN_PKG;

Figure 2.3: Ada generic min function

It is clear that parametric polymorphism is a desirable property of practical
programming languages. Yet, in practice, situations arise where parametric
polymorphism alone cannot provide us with the means to express certain types
adequately. Consider a polymorphic type for min in Figure 2.2. Clearly it is
meaningful for multiple types. However, if we ascribe the type
$\forall\alpha.\ \alpha \to \alpha \to \alpha$ to min, terms without meaning, such as
min(true, false), become typable. It can be seen that min depends on “<” being
defined over its parameters. What is needed is the ability to restrict use of min to
input types whose values are partially ordered. In other words, we need to be able to
overload “<” so that min is polymorphic yet bounded in the types of arguments to
which it can be applied, a form of bounded polymorphism.
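
A rough Python sketch of this bounded form of polymorphism follows. It assumes a
hypothetical table LESS_THAN recording the types at which “<” is overloaded; the
table, its two entries and the function bounded_min are illustrative names and not
part of the thesis. The check is performed at run time here, whereas the type system
discussed in this thesis performs it statically.

```python
# Hypothetical table of the types at which "<" is assumed to be overloaded.
LESS_THAN = {
    int: lambda a, b: a < b,
    str: lambda a, b: a < b,
}


def bounded_min(x, y):
    """min restricted to argument types that have an instance of "<"."""
    if type(x) is not type(y) or type(x) not in LESS_THAN:
        raise TypeError('"<" has no instance at the type of the arguments')
    return x if LESS_THAN[type(x)](x, y) else y


print(bounded_min(3, 5))      # 3
print(bounded_min("b", "a"))  # a
```

In this sketch bounded_min(True, False) is rejected, since bool has no entry in the
table, which corresponds to min(true, false) being untypable under the bounded type.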

C.  OVERLOADING 

The  common  view  of  overloading  is  stated  as  follows: 

An  overloaded  symbol  is  one  that  has  different  meanings  depending  on 
its  context  [ASU86]  (emphasis  added). 
This process of determining the meaning of an expression by examining its context
is called overloading resolution. This is, in fact, the usual way of treating
overloaded symbols in a program: demanding that the local context of an overloaded
symbol determine a particular overloading to be used at each occurrence. This kind of
treatment is used even in the polymorphic language ML. In fact, any overloading
that requires overloading resolution to determine its meaning is termed an incoherent
overloading and gives rise to potential semantic ambiguity. For example,

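$$* : \mathit{real} \to \mathit{real} \to \mathit{real}, \qquad
  * : \forall\alpha.\ \mathit{matrix}(\alpha) \to \mathit{matrix}(\alpha) \to \mathit{matrix}(\alpha)
  \qquad (2.1)$$

is an incoherent overloading of the operator *, where * stands for real
multiplication and matrix multiplication.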

Consider a term $\lambda x.\lambda y.\ x * y$. We can infer two different types for
it: $\mathit{real} \to \mathit{real} \to \mathit{real}$ and
$\forall\alpha.\ \mathit{matrix}(\alpha) \to \mathit{matrix}(\alpha) \to \mathit{matrix}(\alpha)$.
We must apply the process of overloading resolution to determine the meaning of the
term.

A more desirable form of overloading, called coherent overloading, arises when
an overloading is constructed in such a way that its various instances share a
common semantics. In this case, overloading resolution is not required to ascribe a
unique meaning to terms. A term's meaning is uniquely determined from an inspection
of the axioms for the operators occurring in it. For example, suppose * is required
to be commutative. We can readily see that our overloading in (2.1) is incoherent.
For, although we can derive from (2.1) that $\lambda x.\lambda y.\ x * y$ has type
$\mathit{real} \to \mathit{real} \to \mathit{real}$ and
$\forall\alpha.\ \mathit{matrix}(\alpha) \to \mathit{matrix}(\alpha) \to \mathit{matrix}(\alpha)$,
matrix multiplication is not commutative. If we replace our second assumption on *
with $* : \mathit{int} \to \mathit{int} \to \mathit{int}$, with the meaning of integer
multiplication, the overloading now becomes coherent, for both integer and real
multiplication are commutative. So we know that regardless of the types of x and y,
the function is commutative.
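
The difference between the two overloadings can be checked on sample values with a
short Python sketch. The helper names mul_matrix and commutes are hypothetical, the
matrices are fixed at size 2x2 for brevity, and a test on a few values is only
evidence for or against an axiom, not a proof of it.

```python
def mul_matrix(a, b):
    """Product of two 2x2 matrices represented as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def commutes(op, x, y):
    """Check the commutativity axiom for op on one pair of sample values."""
    return op(x, y) == op(y, x)


def int_mul(x, y):
    return x * y


def real_mul(x, y):
    return x * y


m1 = ((1, 2), (3, 4))
m2 = ((0, 1), (1, 0))

print(commutes(int_mul, 3, 4))       # True
print(commutes(real_mul, 1.5, 2.0))  # True
print(commutes(mul_matrix, m1, m2))  # False
```

The integer and real instances agree with the axiom on these samples, while the
matrix instance provides a counterexample, so an overloading that includes it cannot
share the commutative semantics.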
As will be seen, while our implementation of Wo does not prohibit the introduction
of incoherent overloadings, our assumption is that all overloadings are coherent.
If this assumption is invalid with respect to a particular overloading, types will
still be correctly inferred for expressions involving that overloading. However, the
guarantee that the meaning of such an expression is uniquely determined is lost.

Surprisingly, it is common in current languages to introduce incoherent
overloadings regardless of the potential for semantic ambiguities. In Ada, for
example, the operator “/” is overloaded with the different meanings of integer and
floating-point division.

Overloadings allowed in most languages, including Ada, C++ and Standard ML, are
restricted to being finite. In the MLo type system this restriction is lifted. For
example, we can represent an infinite overloading over lists under equality as:
$\forall\alpha\ \mathbf{with}\ = : \alpha \to \alpha \to \mathit{bool}\ .\ \mathit{list}(\alpha) \to \mathit{list}(\alpha) \to \mathit{bool}$.
In this case, if = has an instance at $\tau \to \tau \to \mathit{bool}$, then it also
has an instance at $\mathit{list}(\tau) \to \mathit{list}(\tau) \to \mathit{bool}$.
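
A small Python sketch shows why this single assumption stands for infinitely many
instances. It assumes one base instance of = at int, and represents types as nested
tuples such as ("list", ("int",)); both the representation and the names
EQ_BASE_INSTANCES and has_eq_instance are hypothetical.

```python
# Assumed base instance: = : int -> int -> bool
EQ_BASE_INSTANCES = {("int",)}


def has_eq_instance(t):
    """True if = has an instance at t -> t -> bool.

    Types are nested tuples: ("int",), ("real",), ("list", element_type).
    """
    if t in EQ_BASE_INSTANCES:
        return True
    if t[0] == "list":
        # The constrained scheme: = at list(a) exists whenever = at a exists.
        return has_eq_instance(t[1])
    return False


print(has_eq_instance(("int",)))                      # True
print(has_eq_instance(("list", ("list", ("int",)))))  # True
print(has_eq_instance(("list", ("real",))))           # False
```

Every additional level of list nesting over int is accepted, so the set of instances
is unbounded even though only two assumptions were written down.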
III.  THE ML TYPE SYSTEM AND OVERLOADING

In this chapter we consider an extension of a Curry-style typed lambda calculus
($\lambda^{\to}$) with type schemes, called System ML. As mentioned previously, a
type scheme represents parametric polymorphism, implying that all quantification
must be outermost, or shallow. Research aimed at removing this restriction is
described in [Lei83, McC84, KT90].

A free identifier may be denoted as having infinitely many types via a type
scheme. For instance, the primitive LISP operation hd may be given the type
$\forall\alpha.\ \mathit{seq}(\alpha) \to \alpha$, which would indicate that for any
choice of $\alpha$, say $\tau$, hd has the type $\mathit{seq}(\tau) \to \tau$.

System ML preserves the property of principal types: every typable term has a
principal type, one that is more general than any other type derivable for the term.
For instance, the term $\lambda f.\lambda x.\ f\,x$ has as principal type
$\forall\alpha.\forall\beta.\ (\alpha \to \beta) \to (\alpha \to \beta)$. This is
regarded as the most general typing for this expression. This means that any type
whatsoever of $\lambda f.\lambda x.\ f\,x$ can be derived from the type
$\forall\alpha.\forall\beta.\ (\alpha \to \beta) \to (\alpha \to \beta)$ by suitably
instantiating $\alpha$ and $\beta$; formally, we say that all the types of
$\lambda f.\lambda x.\ f\,x$ are instances of the principal type. The existence of
principal types means that a type inference algorithm will always compute a unique
“best” type for a program.

In order to retain principal types, lambda abstraction in System ML, as in
$\lambda^{\to}$, is monomorphic. This means that lambda-bound identifiers within a
$\lambda$-expression cannot be assigned multiple types. Consider the expression
$(\lambda x.\ x(\lambda y.y))\,\lambda z.z$. This expression is typable in System ML
with principal type $\forall\alpha.\ \alpha \to \alpha$. This conforms to the
restriction on lambda abstraction since x is used at a single type,
$(\alpha \to \alpha) \to (\alpha \to \alpha)$ for some particular $\alpha$, within
the body of the abstraction. The restriction is manifest when an attempt at
self-application is made within a $\lambda$-expression. For example, a term such as
$(\lambda y.\ y\,y)\,\lambda x.x$ is illegal in System ML. Here, y must be able to
assume two different types: $(\alpha \to \alpha)$ and $\alpha$ for some particular
$\alpha$. This results in the term $\lambda y.\ y\,y$ having type
$\forall\alpha.\ (\forall\beta.\ \beta \to \beta) \to \alpha$, which involves inner
quantification and is not a principal type.

In order to allow free identifiers denoting polymorphic values to be assigned
multiple types, one uses the let construct. The above expression can then be
represented as $\mathbf{let}\ y = \lambda x.x\ \mathbf{in}\ y\,y$. This involves no
inner quantification, since each instance of y is replaced with $\lambda x.x$ in
determining the type for $y\,y$.

System ML, like $\lambda^{\to}$, has a decidable typability problem. In other
words, if a type exists for a program (there may be more than one), the type
inference algorithm will be able to infer a correct type for it. Conversely, if a
type does not exist, the algorithm is capable of making that determination. System ML
is also widely accepted and has been incorporated into mainstream languages like
Standard ML [HMM86] and Miranda [Tur86]. Yet an obvious and practical limitation
exists in System ML: it prohibits overloading by restricting the number of
assumptions per identifier in a type assumption set to at most one. Milner himself
comments in his 1978 paper [Mil78] that allowing more than one assumption is
desirable.

An extension to the ML type system, called MLo, has been developed [VoS91]. It
retains principal types and allows overloading. Deviations from System ML include
the introduction of constrained type schemes and modifications to the type
instantiation and generalization rules. Many other extensions of System ML have been
proposed to incorporate overloading. Among these are the systems of [Kae88, CDO91,
Smi91, Kae92, Jon92] and those related to the development of the functional
programming language Haskell [WaB89, CHO92, NiP93]. All of these type systems share
the notion of a constrained type scheme in various forms. A critique of these type
systems is given in [Vol93b].

A.  MLo 

Given a set of type variables ($\alpha, \beta, \gamma, \ldots$) and a set of type
constructors (int, real, bool, list, ...) of various arities, the set of unquantified
types is defined by:

$$\tau ::= \alpha \mid \tau \to \tau \mid \chi(\tau_1, \ldots, \tau_n)$$

The set of quantified types or type schemes, then, is defined by

$$\sigma ::= \forall(\alpha_1, \ldots, \alpha_n)\ \mathbf{with}\ (x_1 : \tau_1, \ldots, x_m : \tau_m).\ \tau,$$

where $\alpha_1, \ldots, \alpha_n$ is the set of quantified variables of $\sigma$,
$x_1 : \tau_1, \ldots, x_m : \tau_m$ is the set of constraints on $\sigma$, and
$\tau$ is the body of $\sigma$. If there are no quantified variables, the
“$\forall$” may be omitted. If there are no constraints, the “with” may be omitted.
In our terminology, $\sigma$ will always be reserved to represent a type scheme,
$\bar{\alpha}$ denotes an abbreviation for $\alpha_1, \ldots, \alpha_n$ and $C$ will
be used to represent a list of constraints. The most general form of a type scheme is
then:

$$\sigma ::= \forall\bar{\alpha}\ \mathbf{with}\ C\ .\ \tau$$

A substitution is a set of replacements for type variables applied simultaneously
to all type variables. For example:

$$[\alpha_1, \ldots, \alpha_n := \tau_1, \ldots, \tau_n]$$

is a substitution where all of the $\alpha_i$'s are distinct. The substitution is
applied to a type $\tau$ by simultaneously replacing all of the $\alpha_i$'s with the
corresponding $\tau_i$'s. The application of substitution $S$ to type $\tau$ will be
denoted by $\tau S$.
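
The simultaneous nature of substitution can be made concrete with a short Python
sketch. It assumes a representation in which a type variable is a string and a
constructed type is a tuple whose first component is the constructor name; this
representation and the name apply_subst are hypothetical.

```python
def apply_subst(t, s):
    """Apply substitution s (a dict from variables to types) to type t.

    A type variable is a string; a constructed type is a tuple
    (constructor_name, argument_1, ..., argument_n).
    """
    if isinstance(t, str):  # type variable: replace it once, or leave it alone
        return s.get(t, t)
    name, *args = t         # constructor: substitute in every argument
    return (name, *(apply_subst(a, s) for a in args))


# Instantiating (a -> b) -> (a -> b) with [a, b := int, bool]
principal = ("->", ("->", "a", "b"), ("->", "a", "b"))
s = {"a": ("int",), "b": ("bool",)}
print(apply_subst(principal, s))
# ('->', ('->', ('int',), ('bool',)), ('->', ('int',), ('bool',)))

# Replacements are simultaneous: [a, b := b, a] swaps the two variables.
print(apply_subst(("->", "a", "b"), {"a": "b", "b": "a"}))
# ('->', 'b', 'a')
```

Because each variable is looked up exactly once, the replacement for one variable is
never rewritten by the replacement for another, which is what "simultaneously" means
above.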

Two new type assignment rules, ($\forall$-intro) and ($\forall$-elim), are given
in Figure 3.1; these represent extensions to System ML developed to accommodate
overloading. It should also be noted that if the constraint list $C$ is empty, these
two extensions are identical to type generalization and instantiation in System ML
[Mil78, DaM82].
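(hypoth)    $A \vdash x : \sigma$, if $x : \sigma \in A$

($\to$-intro)    $\dfrac{A_x \cup \{x : \tau\} \vdash M : \tau'}{A \vdash \lambda x.M : \tau \to \tau'}$

($\to$-elim)    $\dfrac{A \vdash M : \tau \to \tau', \quad A \vdash N : \tau}{A \vdash (M\,N) : \tau'}$

(let)    $\dfrac{A \vdash M : \sigma, \quad A_x \cup \{x : \sigma\} \vdash N : \tau}{A \vdash \mathbf{let}\ x = M\ \mathbf{in}\ N : \tau}$

($\forall$-intro)    $\dfrac{A \cup C \vdash M : \tau', \quad A \vdash C[\bar{\alpha} := \bar{\tau}]}{A \vdash M : \forall\bar{\alpha}\ \mathbf{with}\ C\ .\ \tau'}$    ($\bar{\alpha}$ not free in $A$)

($\forall$-elim)    $\dfrac{A \vdash M : \forall\bar{\alpha}\ \mathbf{with}\ C\ .\ \tau', \quad A \vdash C[\bar{\alpha} := \bar{\tau}]}{A \vdash M : \tau'[\bar{\alpha} := \bar{\tau}]}$

Figure 3.1: System MLo
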

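Consider a term $M = \lambda x.\lambda y.((x * x) = y)$ which contains free
identifiers * and =, and the following assumption set $A$:

$$\begin{array}{l}
* : \mathit{real} \to \mathit{real} \to \mathit{real}, \\
* : \mathit{int} \to \mathit{int} \to \mathit{int}, \\
= : \mathit{int} \to \mathit{int} \to \mathit{bool}, \\
= : \forall\alpha\ \mathbf{with}\ * : \alpha \to \alpha \to \alpha,\ = : \alpha \to \alpha \to \mathit{bool}\ .\ \mathit{list}(\alpha) \to \mathit{list}(\alpha) \to \mathit{bool}
\end{array}$$

Here we show a derivation of

$$A \vdash \lambda x.\lambda y.((x * x) = y) : \forall\alpha\ \mathbf{with}\ * : \alpha \to \alpha \to \alpha,\ = : \alpha \to \alpha \to \mathit{bool}\ .\ \alpha \to \alpha \to \mathit{bool}$$

in MLo.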

In the steps below, $C$ abbreviates the constraint set
$\{* : \alpha \to \alpha \to \alpha\} \cup \{= : \alpha \to \alpha \to \mathit{bool}\}$
and $\Gamma$ abbreviates $A \cup C \cup \{x : \alpha\} \cup \{y : \alpha\}$.

$$\begin{array}{rll}
(1) & \Gamma \vdash x : \alpha & (\text{hypoth}) \\
(2) & \Gamma \vdash * : \alpha \to \alpha \to \alpha & (\text{hypoth}) \\
(3) & \Gamma \vdash (*\ x) : \alpha \to \alpha & (\to\text{-elim}) \\
(4) & \Gamma \vdash (x * x) : \alpha & (\to\text{-elim}) \\
(5) & \Gamma \vdash y : \alpha & (\text{hypoth}) \\
(6) & \Gamma \vdash\ = : \alpha \to \alpha \to \mathit{bool} & (\text{hypoth}) \\
(7) & \Gamma \vdash (=\ (x * x)) : \alpha \to \mathit{bool} & (\to\text{-elim}) \\
(8) & \Gamma \vdash (x * x) = y : \mathit{bool} & (\to\text{-elim}) \\
(9) & A \cup C \cup \{x : \alpha\} \vdash \lambda y.((x * x) = y) : \alpha \to \mathit{bool} & (\to\text{-intro}) \\
(10) & A \cup C \vdash \lambda x.\lambda y.((x * x) = y) : \alpha \to \alpha \to \mathit{bool} & (\to\text{-intro}) \\
(11) & A \vdash C[\alpha := \mathit{int}] & (\text{hypoth}) \\
(12) & A \vdash \lambda x.\lambda y.((x * x) = y) : \forall\alpha\ \mathbf{with}\ * : \alpha \to \alpha \to \alpha,\ = : \alpha \to \alpha \to \mathit{bool}\ .\ \alpha \to \alpha \to \mathit{bool} & (\forall\text{-intro})
\end{array}$$


We are required to introduce assumptions about * and = in our derivation in
order to arrive at a type for $M$. However, for our derivation to succeed, we need
to be able to discharge those assumptions via the $\forall$-intro rule. This ensures
that the type of $M$ can be derived from only our initial assumption set $A$. The
$\forall$-intro rule deviates from generalization in System ML in that it requires
that the constraint set $C$ be derivable from the initial assumption set $A$. This
ensures that $C$ is satisfiable with respect to $A$. In our derivation, we can see
from (11) that satisfiability is achieved by substituting int for $\alpha$. In
general, there can be more than one finite type which satisfies this requirement. For
example, if = were defined for reals, both int and real could be used in our
substitution for $\alpha$. Conversely, it is not always possible to achieve
satisfiability. For instance, if we removed the second assumption on * from $A$, our
derivation would end at (10). There would be no single substitution for $\alpha$
which could satisfy the overlapping constraint requirements in (11) and we would
conclude that $M$ is untypable with respect to $A$.
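
The three situations just described can be reproduced with a small Python sketch of
the satisfiability check. It is a simplification under stated assumptions: only the
ground instances of * and = are modelled (the list instance of = is left out), every
constraint has the shape "identifier has an instance at $\alpha$", and the names A,
satisfying_types and the variant assumption sets are hypothetical.

```python
# Hypothetical ground instances taken from the assumption set A in the text.
# The list instance of = is left out to keep the sketch small.
A = {
    "*": {"real", "int"},  # * : real -> real -> real,  * : int -> int -> int
    "=": {"int"},          # = : int -> int -> bool
}


def satisfying_types(constrained_identifiers, assumptions):
    """Return the base types at which every constrained identifier has an instance."""
    candidates = None
    for ident in constrained_identifiers:
        instances = assumptions.get(ident, set())
        candidates = set(instances) if candidates is None else candidates & instances
    return candidates or set()


constraints = ["*", "="]  # * : a -> a -> a  and  = : a -> a -> bool

print(satisfying_types(constraints, A))                     # {'int'}

without_int_mul = {"*": {"real"}, "=": {"int"}}
print(satisfying_types(constraints, without_int_mul))       # set()

with_real_eq = {"*": {"real", "int"}, "=": {"int", "real"}}
print(sorted(satisfying_types(constraints, with_real_eq)))  # ['int', 'real']
```

An empty result corresponds to the derivation stopping at step (10): no single choice
for $\alpha$ discharges both constraints, so the term is untypable with respect to
that assumption set.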

This requirement for satisfiability of constraint sets ensures that the type system
MLo is sound. It is interesting to compare MLo to a similar extension to System ML
proposed by Kaes [Kae88] based on type kinds, where a type kind is a universe of
types over which a type variable may be quantified. Kaes proposed a restricted form
of overloading which is generally the same restriction adopted by MLo. However,
his type system turns out to be unsound in that it does not enforce satisfiability of
constraint sets as outlined above. This results in terms with multiple
non-overlapping constraints being deemed typable in some instances. In the last
example of the previous paragraph, for instance, the term $M$ would be deemed
typable. On the other hand, the similar work of [CDO91], in an effort to relax the
restrictions on overloading in Kaes's type system, enforces satisfiability and hence
remains sound.

We have shown, by example, the process required to determine the typability
of a term in MLo. This process can be described as a modification to the concept,
used in System ML, of strong type inference [Tiu90]. Formally, strong type inference
says that a term $M$ is typable with respect to an assumption set $B$ if
$A \vdash M : \sigma$ is derivable for some type $\sigma$ and $B \subseteq A$. This
criterion turns out to be less restrictive than required in the presence of
overloading. We are free, under strong type inference, to choose any assumption set
$A$ which contains $B$. Returning to our derivation, it can be seen that, in step
(11), we would have the freedom to introduce any new assumptions we required in order
to satisfy typability under strong type inference, resulting in untypable terms being
deemed typable. Strong type inference relies on the premise that assumption sets may
contain at most one assumption per identifier. This premise, of course, does not hold
in MLo. We can then view typability in MLo as being that of strong type inference
with the requirement that $B = A$. In other words, $B \vdash M : \sigma$ must be
derivable for some type $\sigma$.
IV.  TYPE  INFERENCE  IN  SYSTEM  MLo

An algorithm for MLo, based on W of System ML, has been developed and named
Wo [Smi91]. In this chapter, we will discuss type inference utilizing Wo, which is
given in Figure 4.1.

Wo infers principal types for typable expressions in MLo, failing on untypable
expressions. Given assumption set $A$ and expression $e$, Wo($A$, $e$) returns
$(S, B, \tau)$, where $S$ is a substitution such that $AS \cup B \vdash e : \tau$ is
derivable. $B$ represents a set of constraints on $A$, which describe dependencies
associated with overloaded identifiers occurring in $e$, needed to arrive at a type
for $e$. Wo, unlike W, utilizes the least common generalization (LCG) of an
identifier overloaded in $A$. This concept, along with the functions
close($A$, $B$, $\tau$) and unify($\tau$, $\tau'$), we will examine in some detail in
this chapter.

The LCG of an overloaded identifier can, perhaps, be best described by beginning
with an example. Consider the identifier $*$, overloaded in $A$ with the assumptions
$* : int \to int \to int$, $* : real \to real \to real$ and $* : int \to real \to real$.
We can see that all of these assumptions have in common second and third arguments
which are identical. There is no common ground in their structure with respect to
their first arguments. We can describe their common structure by the use of two
quantified type variables, one for the first argument and another for the remaining
two. We would then assign as the LCG of $*$, $\forall\alpha,\beta.\ \alpha \to \beta \to \beta$.
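
The construction in this example can be sketched as anti-unification. The following is a minimal illustration, not the algorithm of [Rey70] verbatim: it assumes a hypothetical representation in which a type is either a string (a constant such as "int") or a tuple (constructor, arg1, ..., argn), and generated type variables are strings beginning with an apostrophe. The names lcg and arrow are ours.

```python
from itertools import count

def arrow(a, b, c):
    """Right-nested function type a -> b -> c (helper for the example)."""
    return ("->", a, ("->", b, c))

def lcg(types):
    """Least common generalization of a non-empty list of types."""
    table = {}                                # disagreeing column -> variable
    fresh = (f"'a{i}" for i in count())       # 'a0, 'a1, ...

    def go(ts):
        head = ts[0]
        if all(t == head for t in ts):        # identical everywhere: keep
            return head
        if isinstance(head, tuple) and all(
                isinstance(t, tuple) and t[0] == head[0] and len(t) == len(head)
                for t in ts):
            # Same constructor and arity: generalize argument by argument.
            return (head[0],) + tuple(go([t[i] for t in ts])
                                      for i in range(1, len(head)))
        key = tuple(ts)                       # same disagreement, same variable
        if key not in table:
            table[key] = next(fresh)
        return table[key]

    return go(list(types))

star = [arrow("int", "int", "int"),
        arrow("real", "real", "real"),
        arrow("int", "real", "real")]
print(lcg(star))
```

Reusing one variable for each distinct column of disagreeing subterms is what makes the second and third argument positions share a variable while the first position receives its own.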

More formally, we can say that a common generalization of some set of finite
types $\tau_1, \ldots, \tau_n$ is $\tau$ if we can apply some set of substitutions
$S_1, \ldots, S_n$ such that $\forall i.\ \tau S_i = \tau_i$. We further say that $\tau$
is a least common generalization if, for any other common generalization $\tau'$ of
$\tau_1, \ldots, \tau_n$, there exists a substitution $S$ such that $\tau' S = \tau$. We can

$W_o(A, e)$ is defined by cases:

e is $x$

    if $x$ is overloaded in $A$ with LCG $\forall\bar{\alpha}.\tau$,
        return $([\,], \{x : \tau S\}, \tau S)$
        where $S = [\bar{\alpha} := \bar{\beta}]$ and $\bar{\beta}$ are new
    else if $(x : \forall\bar{\alpha}$ with $C\,.\,\tau) \in A$,
        return $([\,], CS, \tau S)$
        where $S = [\bar{\alpha} := \bar{\beta}]$ and $\bar{\beta}$ are new
    else fail.

e is $\lambda x.M$

    let $(S, B, \tau) = W_o(A_x \cup \{x : \alpha\}, M)$ where $\alpha$ is new
    return $(S, B, \alpha S \to \tau)$.

e is $M\,N$

    let $(S, B, \tau) = W_o(A, M)$
    let $(S', B', \tau') = W_o(AS, N)$
    let $S'' = unify(\tau S', \tau' \to \alpha)$ where $\alpha$ is new
    return $(SS'S'', BS'S'' \cup B'S'', \alpha S'')$.

e is let $x = M$ in $N$

    let $(S, B, \tau) = W_o(A, M)$
    let $(B', \sigma) = close(AS, B, \tau)$
    let $(S', B'', \tau') = W_o(A_x S \cup \{x : \sigma\}, N)$
    return $(SS', B'S' \cup B'', \tau')$.

Figure 4.1: Algorithm $W_o$


extend  this  principle  to  constrained  type  schemes  by  applying  the  concept  over  the 
bodies  of  each  constrained  type  scheme.  Least  common  generalizations  are  discussed 
in  [Rey70],  which  gives  an  algorithm  for  computing  them. 

Function $unify$ of $W_o$ performs first-order unification of terms in expressions.
In essence, $unify(\tau', \tau'')$ returns a substitution $S$ such that $\tau' S = \tau'' S$,
and fails if no such substitution exists. Formal discussions of unification are given
by Robinson and Knight in [Rob65, Kni89].
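
As an illustration only, first-order unification with an occurs check can be sketched as follows. The representation is an assumption of this sketch, not that of our implementation: type variables are strings beginning with an apostrophe, constants are other strings, constructed types are tuples (constructor, arg1, ..., argn), and a substitution is a dictionary.

```python
def is_var(t):
    return isinstance(t, str) and t.startswith("'")

def apply(s, t):
    """Apply substitution s (variable -> type) to type t."""
    if is_var(t):
        return apply(s, s[t]) if t in s else t
    if isinstance(t, tuple):
        return (t[0],) + tuple(apply(s, a) for a in t[1:])
    return t

def occurs(v, t, s):
    """True if variable v occurs in t after applying s."""
    t = apply(s, t)
    if t == v:
        return True
    return isinstance(t, tuple) and any(occurs(v, a, s) for a in t[1:])

def unify(t1, t2, s=None):
    """Return s such that apply(s, t1) == apply(s, t2); raise TypeError if none exists."""
    s = dict(s or {})
    t1, t2 = apply(s, t1), apply(s, t2)
    if t1 == t2:
        return s
    if is_var(t1):
        if occurs(t1, t2, s):
            raise TypeError(f"occurs check failed: {t1} in {t2}")
        s[t1] = t2
        return s
    if is_var(t2):
        return unify(t2, t1, s)
    if (isinstance(t1, tuple) and isinstance(t2, tuple)
            and t1[0] == t2[0] and len(t1) == len(t2)):
        for a, b in zip(t1[1:], t2[1:]):
            s = unify(a, b, s)            # thread the substitution through
        return s
    raise TypeError(f"cannot unify {t1} with {t2}")

# Application case of the algorithm: unify the function's type with (argument type -> new variable).
s = unify(("->", "int", "real"), ("->", "int", "'r"))
print(apply(s, "'r"))
```

In the application case of Figure 4.1 the new variable plays the role of the result type, so applying the returned substitution to it yields the type of the application.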

Function $close$ of $W_o$ takes as input $(A, B, \tau)$ and returns a constrained type
scheme for $\tau$. This is accomplished, essentially, by applying the ($\forall$-intro)
rule of $ML_o$ to $\tau$. Function $satisfy$ within $close$ checks for satisfiability of
$B$ with respect to $A$. The issue of satisfiability turns out to be one of the more
interesting problems in the $ML_o$ type system. We will therefore discuss this problem
in detail later in this chapter. Actually, there is latitude in how one computes the
closure of a type in $W_o$. A basic algorithm for $close$ is given by Smith [Smi91],
which is sufficient to support his soundness and completeness proofs of $W_o$, but
leaves the critical issue of satisfiability somewhat unresolved. Our implementation
of $W_o$ uses an algorithm developed by Volpano which incrementally determines
satisfiability as an expression is being constructed [Vol93a]. This approach allows
us to detect certain type errors, with respect to constraints, earlier than the
alternative approach of delaying satisfiability checks until the complete expression
has been type checked.

We reproduce Volpano's algorithm for $close(A, B, \tau)$ here for the sake of
completeness:

1. Let $V$ be the set of all finite types in $B$. For any two types $\tau_1$ and $\tau_2$
   in $V$, define an undirected edge $(\tau_1, \tau_2)$ if types $\tau_1$ and $\tau_2$
   share a type variable, and let $E$ be the set of all such edges.

2. Let $B'$ be the set of all constraints $x : \tau'$ in $B$ for which there is no type
   $\tau''$ such that $\tau''$ contains a variable free in $A$ and there is a path from
   $\tau'$ to $\tau''$ in $(V, E)$.

3. If $B'$ is unsatisfiable under $A$ then fail.

4. Let $C$ be the set of all constraints $x : \tau'$ in $B$ for which there is a type
   $\tau''$ such that $\tau$ and $\tau''$ share a type variable and there is a path from
   $\tau'$ to $\tau''$ in $(V, E)$.

5. Return $(B - B', \forall\bar{\alpha}$ with $C\,.\,\tau)$, where $\bar{\alpha}$ are the type
   variables free in $C$ or $\tau$ but not $A$.
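
The five steps above can be sketched as follows. This is only an illustration under stated assumptions: types are strings or tuples (constructor, arg1, ..., argn), type variables are strings beginning with an apostrophe, $A$ is summarized by the set of type variables free in it, a type counts as connected to itself, and the satisfiability test of step 3 is passed in as a hypothetical caller-supplied predicate.

```python
def type_vars(t):
    """Set of type variables occurring in type t."""
    if isinstance(t, str):
        return {t} if t.startswith("'") else set()
    out = set()
    for arg in t[1:]:
        out |= type_vars(arg)
    return out

def connected(start, V):
    """Types reachable from start in (V, E); edges join types sharing a variable."""
    seen, frontier = {start}, [start]
    while frontier:
        t = frontier.pop()
        for u in V:
            if u not in seen and type_vars(t) & type_vars(u):
                seen.add(u)
                frontier.append(u)
    return seen

def close(free_in_A, B, tau, satisfiable):
    """B is a list of (identifier, type) constraints; returns (B - B', scheme)."""
    V = [t for _, t in B]                                    # step 1
    def linked_to(t, variables):
        return any(type_vars(u) & variables for u in connected(t, V))
    B1 = [c for c in B if not linked_to(c[1], free_in_A)]    # step 2
    if not satisfiable(B1):                                  # step 3
        raise TypeError("constraints unsatisfiable under A")
    C = [c for c in B if linked_to(c[1], type_vars(tau))]    # step 4
    bound = set(type_vars(tau))
    for _, t in C:
        bound |= type_vars(t)
    bound -= free_in_A
    rest = [c for c in B if c not in B1]
    return rest, (sorted(bound), C, tau)                     # step 5

def arrow(a, b, c):
    return ("->", a, ("->", b, c))

# Two constraints sharing 'g, one of which mentions 'a, a variable free in A.
B = [("+", arrow("'g", "'g", "'g")), ("+", arrow("'g", "'a", "'a"))]
tau = ("->", "'g", ("pair", "'g", "'a"))
checked = []
rest, scheme = close({"'a"}, B, tau, lambda cs: checked.append(cs) or True)
print(rest, scheme)
```

Because both constraints are connected to a variable free in $A$, nothing is handed to the satisfiability test and the whole of $B$ is retained, with only 'g quantified in the resulting scheme.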

In steps (1) and (2) we define a graph which connects constraints in $B$ which
share a type variable, and extract the constraints whose types are not connected,
through shared type variables, to any type variable free in $A$. Set $B'$ then contains
all of the constraints in $B$ which can be eliminated, provided they are collectively
satisfiable with respect to $A$. If we assume, as we do in our implementation, that the
initial assumption set cannot contain free type variables, then in the final call to
$close$ we are guaranteed that all constraints in $B$ will be discharged. This approach
allows us to perform satisfiability checks in an incremental manner. We do not
eliminate a constraint from $B$ if it requires us to instantiate a type variable to some
finite type; a subsequent term in the expression may require instantiation of that
type variable, in which case we need to be able to ensure that previous overloading
dependencies are satisfied. Consider the example, slightly modified, from [Vol93a].
If we have assumption set:

$$
A = \left\{
\begin{array}{l}
b : bool, \\
+ : int \to int \to int, \\
+ : int \to real \to real, \\
= \; : \forall\alpha.\ \alpha \to \alpha \to bool
\end{array}
\right\}
$$

recognizing that $+$ has LCG $\forall\alpha.\ \alpha \to \alpha \to \alpha$, say we have the
partial expression

$$\lambda x.\ \text{let } y = \lambda z.\, pair(z + z,\ z + x) \text{ in } \langle exp \rangle$$

where $\langle exp \rangle$ represents a placeholder. $W_o$, in the process of computing a
type for $y$, makes the call,

$$close(A \cup \{x : \alpha\},\ B,\ \gamma \to (\gamma \times \alpha)),$$

where $B$ is the constraint set,

$$B = \{+ : \gamma \to \gamma \to \gamma,\ + : \gamma \to \alpha \to \alpha\}.$$

Function $close$ determines that $B'$ is empty, since all constraints in $B$ share a type
variable, $\gamma$. So $B$ is determined satisfiable, and $close$ leaves $B$ intact in returning

$$(B,\ \forall\gamma \text{ with } B\,.\,\gamma \to (\gamma \times \alpha)).$$

Now, suppose we replace $\langle exp \rangle$ with the term $x = b$. This determines the
type of $x$ to be $bool$. $W_o$ now makes its final call to $close$ for the entire
$\lambda$-expression as

$$close(A,\ B,\ bool \to bool)$$

where

$$B = \{+ : \gamma \to \gamma \to \gamma,\ + : \gamma \to bool \to bool\}.$$

In our final call to $close$, since our initial assumption set contains no free type
variables, step (2) of the algorithm discharges all constraints from $B$. This final
call, then, fails since the second constraint on $+$ is unsatisfiable. In the previous
call to $close$, if we had discharged the constraints on $+$ by including them in $B'$,
satisfiability would be decided by instantiation of $\gamma$ to $int$ and $\alpha$ to $real$.
As a result, the final call to $close$ would succeed, causing an untypable expression
to be deemed typable.

A.  PARAMETRIC  OVERLOADING  AND 
SATISFIABILITY 

Typability in $ML_o$ is Turing reducible to the problem of deciding whether a set
of constraints is satisfiable with respect to a given set of type assumptions. Through
the use of constrained type schemes, we can be very expressive in representing
overloadings. It turns out that, unless we restrict our representations to certain
kinds of overloadings, the problem of constraint-set satisfiability, and therefore
typability, in $ML_o$ is undecidable [Smi91].

$$
\begin{array}{rcl}
/ & : & int \to int \to real \\
/ & : & real \to real \to real \\
+ & : & int \to int \to int \\
+ & : & real \to real \to real \\
+ & : & \forall\alpha \text{ with } + : \alpha \to \alpha \to \alpha\ .\ list(\alpha) \to list(\alpha) \to list(\alpha) \\
avg & : & \forall\alpha \text{ with } + : \alpha \to \alpha \to \alpha,\ / : \alpha \to \alpha \to real\ .\ list(\alpha) \to real \\
avg & : & \forall\alpha \text{ with } + : \alpha \to \alpha \to \alpha,\ / : \alpha \to \alpha \to real\ .\ set(\alpha) \to real
\end{array}
$$

Figure 4.2: Infinite and recursive overloadings

Consider the assumption set in Figure 4.2. We can see that the assumptions
on $avg$ and $+$ contain infinite overloadings, e.g., $+$ can assume a finite type, say
$list(list(\ldots(list(int))))$. Note also the occurrences of recursive overloadings, where
the satisfiability of the constraint set depends on the assumption itself. A mutually
recursive overloading would result if we added a constraint involving $avg$ to the third
assumption on $+$.

Constraint-set satisfiability remains undecidable in the presence of mutual
recursion and/or straight recursion without restrictions [Vol94a]. We should therefore
explore  suitable  bounds  on  recursion  which  make  our  satisfiability  problem  decidable. 
We  can  see  that  recursion  is  a  natural  occurrence  in  practice  through  our  example 
in  Figure  4.2.  For  this  reason,  while  it  makes  constraint-set  satisfiability  decidable, 
forbidding  recursion  entirely  is  unacceptable. 

Various approaches have been examined. Smith gives a restriction called
overloading by constructors which makes constraint-set satisfiability decidable in
polynomial time [Smi91]. But it disallows constraints on an overloaded identifier $x$
involving $y$ where $x \neq y$. This would prohibit the overloading on $avg$ in Figure 4.2.
Another restriction, similar to that proposed by Kaes [Kae88] and adopted by Haskell
[Has89], is called parametric overloading.

Parametric overloading is a more practical form of overloading which allows
naturally recursive overloading like that of Figure 4.2 and makes constraint-set
satisfiability decidable. This is the form that we adopt in this thesis.




Parametric overloading makes use of the concept of the least common
generalization of finite types discussed earlier. We give a formal definition here from
[Vol94b].

Definition 4.1 Parametric assumption sets are defined inductively.

The empty set is parametric.

If $A$ is parametric with no assumption for $x$ and $\sigma$ is a constrained type scheme
$\forall\bar{\alpha}$ with $C\,.\,\tau$ such that for each $z : \rho \in C$, $z$ is overloaded in $A$
and $\rho$ is a generic instance of its LCG, then $A \cup \{x : \sigma\}$ is parametric.

If $A$ is parametric with no assumption for $x$ and $B$ is the set

$$
\left\{
\begin{array}{l}
x : \forall\bar{\gamma}_1 \text{ with } C_1\,.\,\tau[\alpha := \chi_1(\bar{\gamma}_1)] \\
\quad\vdots \\
x : \forall\bar{\gamma}_n \text{ with } C_n\,.\,\tau[\alpha := \chi_n(\bar{\gamma}_n)]
\end{array}
\right\}
$$

such that

•  $x$ has LCG $\forall\alpha.\ \tau$,

•  $\chi_i \neq \chi_j$ for $i \neq j$ (where the $\chi$'s are type constructors of various
   arities), and

•  $z : \rho \in C_i$ implies that $z$ has LCG $\forall\pi.\ \rho$, for some $\pi \in \bar{\gamma}_i$,
   and either $z$ is overloaded in $A$ or $z = x$,

then $A \cup B$ is parametric.

Note that we can only specify constraints which involve an overloaded identifier;
constraints involving finite types or even polymorphic types are not allowed
under our definition. Though there are instances where this limits the practical use
of parametric overloading, this restriction is generally not a limiting factor in
practice. Smith has considered approaches to relaxing this particular restriction for
type checking a language with subtyping and overloading [Smi91, Smi93]. This thesis,
however, considers overloading only. We can also make the observation that an
identifier $x$ parametrically overloaded in $A$ can always be characterized by an LCG
which has only one quantified variable. This gives us a practical view of the
restrictions we are talking about.

$$
\Sigma = \left\{
\begin{array}{ll}
\Sigma_0 : & int,\ real,\ bool \\
\Sigma_1 : & list,\ ref \\
\Sigma_2 : & map,\ pair
\end{array}
\right.
$$

Figure 4.3: Type constructors of various arities

We can characterize parametric overloadings as a regular forest of trees [Vol94b].
These regular forests can be generated by a class of context-free grammars called
regular tree grammars [GeS84]. If $A$ is parametric then every overloaded identifier
$x$ in $A$ has an LCG of the form $\forall\alpha.\tau$, and the set of finite types $\pi$ to which
$\alpha$ can be instantiated, meaning $A \vdash x : \tau[\alpha := \pi]$ is derivable, forms a
regular tree language or forest.

B.  SATISFIABILITY  ALGORITHM 

Constraint-set satisfiability is determined by the function $satisfiable(A, C)$,
which takes the assumption set $A$ and the constraint set $C$ as inputs. For any
parametric assumption set $A$, we can construct for every overloaded identifier $x$ a
regular tree grammar $G_x$ such that if $x$ has LCG $\forall\alpha.\ \tau$ then for any
variable-free finite type $\tau'$, we can derive $A \vdash x : \tau[\alpha := \tau']$ if and
only if $\tau' \in L(G_x)$, where $L(G_x)$ represents the regular tree language generated
by $G_x$. In this context, we need only parse $\tau'$ with respect to $L(G_x)$ to determine
whether constraint $x : \tau[\alpha := \tau']$ is satisfiable with respect to $A$.
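
Parsing a variable-free type against a regular tree grammar can be sketched as a simple recursive membership test. The encoding below is an assumption made for illustration: a grammar maps each non-terminal to a list of productions (constructor, child non-terminals), and a type is a string or a tuple (constructor, arg1, ..., argn). The two grammars are those obtained for $+$ and $avg$ from the assumption set of Figure 4.2.

```python
def derives(grammar, nt, t):
    """True if the variable-free type t is in the language of non-terminal nt."""
    ctor, args = (t, ()) if isinstance(t, str) else (t[0], t[1:])
    return any(
        c == ctor and len(kids) == len(args)
        and all(derives(grammar, k, a) for k, a in zip(kids, args))
        for c, kids in grammar[nt]
    )

# B = int | real | list(B)
G_PLUS = {"B": [("int", ()), ("real", ()), ("list", ("B",))]}

# C = list(D) | set(D),  D = int | real
G_AVG = {"C": [("list", ("D",)), ("set", ("D",))],
         "D": [("int", ()), ("real", ())]}

print(derives(G_PLUS, "B", ("list", ("list", "int"))))   # + at list(list(int))
print(derives(G_AVG, "C", ("list", ("list", "int"))))    # avg at list(list(int))
```

The recursion always descends into a strict subterm of the type, so the test terminates even though the grammar for $+$ is recursive.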

An algorithm for satisfiability has been developed based on the property that
regular forests are effectively closed under intersection [Vol94b]. Our implementation
of $W_o$ uses this algorithm. Consider an example using the parametric assumption set
of Figure 4.2 and the type constructors in Figure 4.3, which includes constructors of
arities 0, 1 and 2.

We can see that $/$, $+$ and $avg$ are overloaded in $A$ with respective LCGs
$\forall\alpha.\ \alpha \to \alpha \to real$, $\forall\alpha.\ \alpha \to \alpha \to \alpha$ and
$\forall\alpha.\ \alpha \to real$. Our first task is to construct a dependency graph of
assumptions in $A$; if an assumption on $x$ contains a constraint on $y$ we need to
produce the grammar of $y$ before we produce $x$'s grammar. We can then proceed to
create regular tree grammars for each overloaded identifier in $A$ based on
dependencies. We see that $avg$ depends on $/$ and $+$ in $A$, so we must compute
the grammar for $avg$ last.
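
One way to obtain such an order is a topological sort of the dependency graph, ignoring an identifier's dependence on itself. The sketch below uses Python's standard graphlib module; the function name grammar_order and the dictionary encoding of dependencies are assumptions of the sketch rather than part of our implementation.

```python
from graphlib import TopologicalSorter

def grammar_order(constrained_by):
    """Order in which to build grammars.

    constrained_by maps each overloaded identifier to the set of identifiers
    mentioned in the constraints of its assumptions.
    """
    # Drop self-dependencies: a recursive overloading refers to its own
    # start symbol and does not need an earlier grammar.
    deps = {x: {y for y in ys if y != x} for x, ys in constrained_by.items()}
    return list(TopologicalSorter(deps).static_order())

# Dependencies read off Figure 4.2.
order = grammar_order({"/": set(), "+": {"+"}, "avg": {"+", "/"}})
print(order)
```

Under parametric overloading a constraint may mention only the identifier itself or identifiers already overloaded in $A$, so once self-dependencies are removed the graph is acyclic and the sort cannot fail.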

Since identifiers may be overloaded recursively, as in our example, we will
represent occurrences of an identifier $x$ in its own constraint list with the start
symbol for $G_x$. In the case of constrained type schemes with multiple constraints,
as occurs in $avg$, we will represent this as a new non-terminal. This non-terminal
will define new productions for the grammar which result from the computed
intersection of the constraints. Given a constraint set which contains a constraint on
$x$ and a constraint on $y$, their intersection is computed as $L(G_x) \cap L(G_y)$.

We represent the type constructors in $\Sigma$ as a grammar $G_\Sigma$. We can then take
advantage of the fact that $L(G_\Sigma) \cap L(G_x) = L(G_x)$ for any overloaded identifier
$x$ as we construct our grammars for $A$. We therefore obtain the following set of
grammars for our example assumption set $A$:


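$$
\begin{array}{lrcl}
G_\Sigma : & S & = & int \mid real \mid bool \mid list(S) \mid ref(S) \mid S \to S \mid pair(S, S) \\
G_/ : & A & = & int \mid real \\
G_+ : & B & = & int \mid real \mid list(B) \\
G_{avg} : & C & = & list(D) \mid set(D) \\
 & D & = & int \mid real
\end{array}
$$

where the non-terminal $D$ represents $L(G_/) \cap L(G_+)$.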

In our batch implementation of $W_o$, where we do not allow occurrences of free
type variables in the initial assumption set, we can create the set of regular tree
grammars once and reuse the representation. This was the approach we adopted in
our implementation.

We can now determine satisfiability of a constraint set $C$ with respect to an
assumption set $B$ by parsing each constraint in $C$, of the form $id : \tau$, with respect
to the grammars computed for $B$; i.e., if $\tau$ parses with respect to $L(G_{id})$ for each
constraint in $C$ then $C$ is satisfiable. It is possible, though, that we may encounter
overlapping constraints in $C$. In this case we must first compute the intersections of
any overlapping constraints before parsing those that do not overlap. If the computed
intersection is empty then $C$ is unsatisfiable. An intersection is empty if there exists
no common type constructor of arity 0 between constraints. For example, grammar
$G$ below represents an empty intersection.

$$G = list(G) \mid ref(G)$$
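
Intersection and the emptiness test can be illustrated with the usual product construction on regular tree grammars. This is a sketch under assumptions, not the algorithm of [Vol94b] itself: grammars map non-terminals to lists of productions (constructor, child non-terminals), and the function names are ours.

```python
def intersect(g1, s1, g2, s2):
    """Product grammar for L(g1, s1) ∩ L(g2, s2); non-terminals are pairs."""
    product, todo = {}, [(s1, s2)]
    while todo:
        pair = todo.pop()
        if pair in product:
            continue
        n1, n2 = pair
        product[pair] = []
        for c1, kids1 in g1[n1]:
            for c2, kids2 in g2[n2]:
                if c1 == c2 and len(kids1) == len(kids2):
                    kids = tuple(zip(kids1, kids2))
                    product[pair].append((c1, kids))
                    todo.extend(kids)
    return product, (s1, s2)

def nonempty(grammar, start):
    """True if start derives at least one finite tree."""
    productive, changed = set(), True
    while changed:
        changed = False
        for nt, prods in grammar.items():
            if nt not in productive and any(
                    all(k in productive for k in kids) for _, kids in prods):
                productive.add(nt)
                changed = True
    return start in productive

G_DIV = {"A": [("int", ()), ("real", ())]}
G_PLUS = {"B": [("int", ()), ("real", ()), ("list", ("B",))]}
D, start = intersect(G_DIV, "A", G_PLUS, "B")
print(D[start], nonempty(D, start))

# G = list(G) | ref(G): no production of arity 0, so no finite tree is derivable.
G_EMPTY = {"G": [("list", ("G",)), ("ref", ("G",))]}
print(nonempty(G_EMPTY, "G"))
```

A non-terminal is marked productive only once some production has all of its children productive, which can never start without a production of arity 0; this mirrors the emptiness criterion stated above.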

This algorithm is exponential in the number of forests input, but this is very
likely the best we can do, since the problem has been shown to be NP-complete
[Vol94b]. The use of our implementation of $W_o$ should provide valuable insight into
determining whether the NP lower bound for constraint-set satisfiability is a
practical limitation.

V.  IMPLEMENTATION  OF  $W_o$


As we have shown, algorithm $W_o$ has been developed to infer the most general
type of a term given suitable forms of overloading. We envision an interactive
programming environment in which incomplete expressions (which may have
placeholder terms) are type checked and can be subsequently updated, perhaps
requiring new types to be inferred.

In this setting, $W_o$ is unsuitable because it is not incremental. If a function, say
$f$, is computed on input $x$, then on input change $\Delta$, we say that the computation of
$f(x + \Delta)$ is incremental if $f(x + \Delta)$ is computed from only $f(x)$ and $\Delta$.
Although our implementation does not type check definitions, it nonetheless exhibits
incremental type re-computation at the expression level, as we will show.

In efforts to develop an incremental approach to type inference, we might attempt
to re-write $W_o$. We have, however, chosen an approach which makes use of a
formalism, namely attribute grammars, for achieving incremental type re-computation.
Utilizing this formalism, we foresee our implementation not only providing a means
to validate and explore bounds on the problem of type inference in the presence of
overloading, but also serving as a step towards integrating incremental algorithms
for on-line type inference and those for overloading.

A.  THE  ROLE  OF  ATTRIBUTE  GRAMMARS 

Updating expressions affords an opportunity to re-use previous type computation.
The attribute grammar formalism provides a framework in which type re-computation
is identified with attribute re-evaluation. So if attribute re-evaluation is
done incrementally then type re-computation is incremental as well and, furthermore,
it is implicit in the formalism.

Using  an  attribute  grammar  we  can  specify  the  syntax  of  a  language  via  a 
context-free  grammar.  Nodes  of  parse  trees  are  annotated  with  attributes  that  are 
prescribed  by  a  set  of  attribute  equations  given  as  part  of  the  attribute  grammar.  If  a 
parse  tree  is  edited  then  attributes  of  the  tree  are  re-computed  using  the  equations  so 
that  a  consistent  attribution  is  maintained.  Re-computing  the  attributes  is  implicit 
and  is  done  by  the  attribute  evaluator. 

The productions of the context-free grammar for type inference in $ML_o$ which
we have developed for our implementation are given in Figure 5.1. Non-terminals are
represented in upper case while terminals are in lower case. Terminals in productions
that begin with Null represent placeholder terms which have universal type
$\forall\alpha.\alpha$.

Attributes are distinguished as either synthesized or inherited. Synthesized
attributes occur on the left-hand side of attribute equations; inherited attributes occur
on the right-hand side. In other words, in one case attributes are propagated up
(synthesized) in the parse tree and in the other they are propagated down (inherited)
in the parse tree. Figure 5.2 shows the inherited (AI) and synthesized (AS)
attributes associated with the productions of Figure 5.1.

To implement $W_o$, we define attribute equations, which create dependencies
between attribute values. As the derivation tree is updated these dependencies
determine what part of the tree is affected and where selective re-computation, via the
attribute equations, needs to be done in order to re-establish consistent attribute
values throughout the tree. The set of attribute equations in Figure 5.3 then defines
the dependencies required in each attributed production from Figure 5.1 to implement
$W_o$. Functional support, indicated by italics, is simplified and represented by
descriptive function names. The attributes S and B of EXP are precisely those terms
returned by $W_o$ as discussed in Chapter IV.

(1)  TOPLEVEL → ASSUMPTIONSET EXPLIST
(2)  TOPLEVEL → NullPrgrm
(3)  ASSUMPTIONSET → ASSUMPTIONLIST
(4)  ASSUMPTIONSET → NullAssumptions
(5)  ASSUMPTIONLIST_1 → ASSUMPTION ASSUMPTIONLIST_2
(6)  ASSUMPTIONLIST → NullAssumption
(7)  ASSUMPTION → ID TYPESCHEMELIST
(8)  ID → id
(9)  ID → NullId
(10) TYPESCHEMELIST_1 → TYPESCHEME TYPESCHEMELIST_2
(11) TYPESCHEMELIST → NullTypeSchemeList
(12) TYPESCHEME → TYPEVARLIST CONSTRAINTLIST TYPEEXP
(13) TYPESCHEME → NullTypeScheme
(14) TYPEVARLIST_1 → QUANTTYPEVAR TYPEVARLIST_2
(15) TYPEVARLIST → NullTypeVarList
(16) QUANTTYPEVAR → TypeVar
(17) QUANTTYPEVAR → NullTypeVar
(18) CONSTRAINTLIST_1 → CONSTRAINT CONSTRAINTLIST_2
(19) CONSTRAINTLIST → NullConstraintList
(20) CONSTRAINT → ID TYPEEXP
(21) CONSTRAINT → NullConstraint
(22) TYPEEXP → UniversalType
(23) TYPEEXP → Int
(24) TYPEEXP → Real
(25) TYPEEXP → Bool
(26) TYPEEXP → TypeVar
(27) TYPEEXP → NullType
(28) TYPEEXP_1 → Map(TYPEEXP_2 TYPEEXP_3)
(29) TYPEEXP_1 → Pair(TYPEEXP_2 TYPEEXP_3)
(30) TYPEEXP_1 → List(TYPEEXP_2)
(31) TYPEEXP_1 → Seq(TYPEEXP_2)
(32) TYPEEXP_1 → Ref(TYPEEXP_2)
(33) EXPLIST_1 → EXP EXPLIST_2
(34) EXPLIST → NullExpression
(35) EXP → ID
(36) EXP_1 → EXP_2 EXP_3
(37) EXP_1 → λ ID . EXP_2
(38) EXP_1 → let ID = EXP_2 in EXP_3
(39) EXP → NullExp

Figure 5.1: Context-free grammar for $ML_o$ type inference


AI_TOPLEVEL = {}                           AS_TOPLEVEL = {}
AI_ASSUMPTIONSET = {}                      AS_ASSUMPTIONSET = {typeEnv}
AI_ASSUMPTIONLIST = {}                     AS_ASSUMPTIONLIST = {typeEnv}
AI_ASSUMPTION = {}                         AS_ASSUMPTION = {typeEnv}
AI_ID = {}                                 AS_ID = {name}
AI_EXPLIST = {typeEnv, typeGrammar}        AS_EXPLIST = {}
AI_EXP = {typeEnv, typeGrammar}            AS_EXP = {S, B, typeAssignment}


Figure 5.2: Inherited and synthesized attributes of implementation grammar

To illustrate how incremental type recomputation is achieved via incremental
attribute evaluation, consider Figure 5.4. Here we have a partial derivation tree
annotated with a dependence graph showing the propagation of attributes in the
tree. For simplicity, we have chosen one inherited attribute and one synthesized
attribute. The inherited attribute A represents an assumption set. The synthesized
attribute T is a constrained type scheme representing the type of an expression
at each node of the tree. Figure 5.4 represents the partial derivation tree for the
expression $pr(x, \lambda y.\lambda z.\ y\ z)$, where $pr$ is of type
$\forall\alpha,\beta.\ \alpha \to \beta \to pair(\alpha, \beta)$. Suppose the expression rooted
at one node of the tree is updated. We can see that T at node $n_1$ must now be
recomputed, but notice that no change has been made to expressions rooted at nodes
outside the updated subtree, which therefore need not be retypechecked. In practice,
this can result in significant savings, as such an unchanged subtree can be
arbitrarily large.
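
The effect can be imitated with a toy evaluator that caches a synthesized attribute at every node and, on an edit, invalidates only the caches on the path from the edit to the root. This is a simplified sketch, not the change-propagation algorithm used by the Synthesizer Generator: the class Node, the stand-in attribute size, the leaf labels and the choice of which node is edited are all assumptions made for illustration, and the inherited attribute A is not modelled.

```python
class Node:
    def __init__(self, label, *children):
        self.label, self.children = label, list(children)
        self.parent, self.cache = None, None
        for child in self.children:
            child.parent = self

    def replace_child(self, i, new):
        """Edit the tree, then invalidate cached attributes up to the root."""
        self.children[i] = new
        new.parent = self
        node = self
        while node is not None:
            node.cache = None
            node = node.parent

evaluations = []          # labels of nodes whose attribute was (re)computed

def size(node):
    """Stand-in synthesized attribute: number of nodes in the subtree."""
    if node.cache is None:
        evaluations.append(node.label)
        node.cache = 1 + sum(size(c) for c in node.children)
    return node.cache

# Shape of the tree for pr x (λy.λz. y z), with hypothetical leaf labels.
n3 = Node("n3", Node("pr"), Node("x"))
lam = Node("lambda", Node("body"))
n1 = Node("n1", Node("n2", n3, lam))

size(n1)                  # initial attribution visits every node
evaluations.clear()
n3.replace_child(1, Node("x2"))
size(n1)                  # re-attribution after the edit
print(evaluations)
```

After the edit, only the new leaf and its ancestors are re-evaluated; the subtree for the λ-expression keeps its cached value however large it is.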

1.  The  Synthesizer  Generator  Platform 

An attribute evaluator generator takes as input a set of attribute equations,
such as those in Figure 5.3, for a set of terms and outputs an attribute evaluator that
takes a term and annotates it with an attribution as prescribed by the equations.
There are attribute evaluator generators available today that not only output an
attribute evaluator but output one that evaluates attributes incrementally. One
such generator is GrammaTech's Synthesizer Generator (SynGen).

(1)  EXPLIST.typeEnv = InitialEnv() ⊕ ASSUMPTIONSET.typeEnv
     EXPLIST.typeGrammar = ComputeGrammar(ASSUMPTIONSET.typeEnv)

(3)  ASSUMPTIONSET.typeEnv = ASSUMPTIONLIST.typeEnv

(4)  ASSUMPTIONSET.typeEnv = NullTypeEnv

(5)  ASSUMPTIONLIST_1.typeEnv = ConcatEnv((ASSUMPTION.name, ASSUMPTION.type),
                                          ASSUMPTIONLIST_2.typeEnv)

(6)  ASSUMPTIONLIST.typeEnv = NullTypeEnv

(7)  ASSUMPTION.name = ID.name
     ASSUMPTION.type = TYPESCHEMELIST

(8)  ID.name = id

(9)  ID.name = "undeclared"

(33) EXP.typeEnv = EXPLIST_1.typeEnv
     EXP.typeGrammar = EXPLIST_1.typeGrammar
     EXPLIST_2.typeEnv = EXPLIST_1.typeEnv
     EXPLIST_2.typeGrammar = EXPLIST_1.typeGrammar

(35) EXP.typeAssignment = ComputeType(ID.name, EXP.typeEnv)

(36) EXP_1.S = let V = Unify((EXP_3.S EXP_2.typeAssignment),
                             (EXP_3.typeAssignment → NewVar(beta))) in
               V (EXP_3.S EXP_2.S)
     EXP_1.typeAssignment = V beta
     EXP_1.B = (V (EXP_3.S EXP_2.B)) ⊕ (V EXP_3.B)
     EXP_2.typeEnv = EXP_1.typeEnv
     EXP_3.typeEnv = EXP_2.S EXP_1.typeEnv
     EXP_2.typeGrammar = EXP_1.typeGrammar
     EXP_3.typeGrammar = EXP_1.typeGrammar

(37) EXP_1.typeAssignment = (EXP_2.S NewVar(beta)) → EXP_2.typeAssignment
     EXP_1.S = EXP_2.S
     EXP_1.B = EXP_2.B
     EXP_2.typeEnv = ConcatEnv((ID.name, beta), EXP_1.typeEnv)
     EXP_2.typeGrammar = EXP_1.typeGrammar

(38) let (B', σ) = Close((EXP_2.S EXP_1.typeEnv), EXP_2.B,
                         EXP_2.typeAssignment, EXP_1.typeGrammar)
     EXP_1.typeAssignment = EXP_3.typeAssignment
     EXP_1.S = (EXP_3.S EXP_2.S)
     EXP_1.B = (EXP_3.S B') ⊕ EXP_3.B
     EXP_2.typeEnv = EXP_1.typeEnv
     EXP_3.typeEnv = ConcatEnv((ID.name, σ), (EXP_2.S EXP_1.typeEnv))
     EXP_2.typeGrammar = EXP_1.typeGrammar
     EXP_3.typeGrammar = EXP_1.typeGrammar

(39) EXP.typeAssignment = NewVar(beta)
     EXP.S = NullSubst
     EXP.B = NullConstraintList

Figure 5.3: Attribute equations for $ML_o$ type inference

[Figure: derivation tree with nodes $n_1$ (EXP), $n_2$ (APP), $n_3$ (APP) and a further
EXP node, each carrying inherited attribute A and synthesized attribute T, over the
leaves pr, x and (λy.λz. y z).]

Figure 5.4: A partial derivation tree and dependence graph for pr (x, λy.λz. y z)

We have developed our implementation utilizing SynGen for several reasons.
We are able to get a comprehensive and visually appealing X-windows interface with
relative ease. Utilizing SynGen also fits nicely with our opinion of attribute grammars
as being a desirable approach to achieving incremental type inference. Furthermore,
since we profit by the incremental algorithms embedded in SynGen, new advances
in this area, which may well be incorporated in future versions, will directly enhance
our implementation.

The incremental algorithms used in SynGen rely heavily on the concept of
ordered attribute grammars, which were introduced in [Kas80]. The ordered attribute
grammars are a subclass of the noncircular attribute grammars. Though SynGen
can accept attribute grammars which are not ordered, it prohibits circular attribute
grammars.

The language of SynGen is SSL. Every (useful) SSL specification has three
major declaration areas: Abstract syntax, which defines a set of grammar rules;
Attribution, which annotates the grammar with attributes and describes their
dependencies; and Unparsing, which defines display formats for terms, identifies
selectable productions of the grammar and annotates which productions are editable.
For our implementation, Figure 5.1 represents the Abstract syntax and Figures 5.2
and 5.3 represent the Attribution.

2.  The  Implementation 

We demonstrate our implementation through an annotated sequence of
actual X-windows display screens generated by our type checker. Figure 5.5 shows
an initial screen with placeholders for an assumption set entry, where we define
extensions to an Initial Environment, and an expression. The currently selected term,
corresponding to the ASSUMPTIONSET production of our grammar, is underlined.
Note that the type inferred for the placeholder term <exp> is <universal type>.

Figure 5.5: Implementation initial screen.

Figure 5.6: An assumption set defined.

We have entered an assumption set in Figure 5.6 with three overloaded
identifiers. The first type scheme for each identifier, without the constraints, must
represent the LCG of that identifier. The implementation currently does not compute
the LCG and so it must be provided by the user. Note the terms enclosed in
boxes at the bottom of the screen. These are called transforms. With the placeholder
for TYPEEXP selected, we may select a transform with the mouse and replace the
selected placeholder term with a term associated with the transform. This provides
an alternative means to enter terms without the need for the user to remember the
appropriate syntax. Users may also enter terms directly as long as the term being
edited is defined in the unparsing rules.

In Figure 5.7 we have entered three expressions whose types have been
inferred. The type of our first expression is represented by a constrained type scheme;
it is the most general type we can give to it and we can be no more specific without
more information. In the second expression, where r : real is defined in the initial
environment, we see that, since mult is defined over reals, applying expon to r satisfies
the constraint on expon and we are able to infer a finite type for the expression. An
unsatisfiable constraint has been encountered in our final expression. This is a result
of the multiple constraints on mult and eq? in the third assumption of mult. We can
see that the grammar for mult is:

    G_{mult}:  S = int | real | list(U)
               U = int | list(U)

which clearly does not derive list(real).
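
The derivation claim can be checked mechanically. The Python sketch below
encodes the grammar for mult as a dictionary and tests whether a ground type is
derivable from a nonterminal. The encoding and the names G_MULT and derives are
assumptions made for this illustration; they do not reflect the attribute
representation used by the implementation.

```python
# Each nonterminal maps to its productions. A production is a tuple
# (constructor, child_nonterminal, ...); nullary constructors have no children.
G_MULT = {
    "S": [("int",), ("real",), ("list", "U")],
    "U": [("int",), ("list", "U")],
}

def derives(grammar, nonterminal, term):
    """Return True if `nonterminal` derives the ground term `term`.

    A term is a tuple (constructor, subterm, ...). The recursion always
    descends into strictly smaller subterms, so it terminates.
    """
    ctor, args = term[0], term[1:]
    for prod in grammar[nonterminal]:
        if prod[0] == ctor and len(prod) - 1 == len(args):
            if all(derives(grammar, nt, arg) for nt, arg in zip(prod[1:], args)):
                return True
    return False

LIST_REAL = ("list", ("real",))
LIST_LIST_INT = ("list", ("list", ("int",)))
```

With this encoding, derives(G_MULT, "S", LIST_REAL) is False because U offers no
production for real, while derives(G_MULT, "S", LIST_LIST_INT) is True.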

It is also possible to directly examine the attributes of the parse tree at
any point in the execution. This functionality, though mainly useful for debugging,
can provide a means to investigate aspects of the implementation from a lower level
viewpoint. For example, one might wish to examine a representation of the regular
tree grammars produced for overloaded identifiers in a given assumption set. This
can be done by examining the attribute typeGrammar at any EXP node of the
parse tree. For instance, Figure 5.8 shows the regular tree grammars computed for
the assumption set of Figure 5.7. Note that we have chosen to represent the start
symbol of grammar G_{id} as id, for each overloaded identifier id in the assumption
set. In addition, id1_*_id2 was chosen to represent L(G_{id1}) ∩ L(G_{id2}).
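
To make the meaning of an intersection nonterminal concrete, the following
Python sketch builds a grammar for the intersection of two regular tree
languages using the standard product construction. It is an illustration under
assumptions, not a description of how typeGrammar is computed: the dictionary
encoding, the function names, the abbreviated universe Z, and the use of the
nonterminal U for the list case of mult are all choices made for this example.

```python
# Productions are tuples (constructor, child_nonterminal, ...).
G = {
    "Z": [("int",), ("bool",), ("real",), ("list", "Z")],  # abbreviated universe
    "eq?": [("list", "Z"), ("int",)],
    "mult": [("int",), ("real",), ("list", "U")],
    "U": [("int",), ("list", "U")],
}

def intersect(grammar, left, right):
    """Product construction for L(left) ∩ L(right).

    Nonterminals of the result are named 'x_*_y' for each reachable pair (x, y).
    """
    result = {}
    todo = [(left, right)]
    while todo:
        a, b = todo.pop()
        name = f"{a}_*_{b}"
        if name in result:
            continue
        prods = []
        for p in grammar[a]:
            for q in grammar[b]:
                # Only productions with the same constructor and arity combine.
                if p[0] == q[0] and len(p) == len(q):
                    children = list(zip(p[1:], q[1:]))
                    todo.extend(children)
                    prods.append((p[0],) + tuple(f"{x}_*_{y}" for x, y in children))
        result[name] = prods
    return result

def derives(grammar, nonterminal, term):
    """Return True if `nonterminal` derives the ground term `term`."""
    ctor, args = term[0], term[1:]
    return any(
        prod[0] == ctor
        and len(prod) - 1 == len(args)
        and all(derives(grammar, nt, arg) for nt, arg in zip(prod[1:], args))
        for prod in grammar[nonterminal]
    )

EQ_AND_MULT = intersect(G, "eq?", "mult")
```

Here EQ_AND_MULT contains the nonterminals eq?_*_mult and Z_*_U. The start
symbol eq?_*_mult derives int and lists built over int, but neither real nor
list(real), which matches the shape of the eq?_*_mult entry shown in Figure 5.8.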

We have given a brief overview, through examples, of the X-windows
interface and general functionality of our implementation of Wo with parametric
overloading. By examining instances of type inference in MLo, in the setting of our
implementation, we have endeavored to provide the reader with a clearer
understanding of concepts discussed more formally in previous chapters.

ASSUMPTIONS:

expon: ∀a with (mult:(a → (a → a))) . (a → (int → a));
eq?: ∀a with (eq?:(a → (a → bool))) . (a → (a → bool));
   : ∀a with () . (list(a) → (list(a) → bool));
   : (int → (int → bool));
mult: ∀a with (mult:(a → (a → a))) . (a → (a → a));
   : (int → (int → int));
   : (real → (real → real));
   : ∀a with (mult:(a → (a → a)), eq?:(a → (a → bool))) . (list(a) → (list(a) → list(a)))

EXPRESSIONS:

TYPE: ∀(a) with (mult:(a → (a → a))) . (a → (a → a));

expon r
TYPE: (int → real);

… list r)(list r)
<constrainterror> ->
(mult:(list(real) → (list(real) → list(real))))
unsatisfiable.

Z: (int | bool | real | list(Z) | seq(Z) | ref(Z) | (Z → Z) | pair(Z,Z));
eq?: (list(Z) | int);
mult: (int | real | list(eq?_*_mult));
expon: (mult);
eq?_*_mult: (list(eq?_*_mult) | int);

Figure 5.8: Representation of attribute typeGrammar.

VI. CONCLUSIONS


We have considered the problem of type inference in an extension to the type
system ML called MLo. The type system MLo is a formalism which is more suitable
for implementing future languages by virtue of its incorporation of global overloading.
Yet this increased functionality introduces new problems in developing algorithms
which make typability decidable. Without restrictions on the types of overloadings
and the structure of constraint sets, typability in MLo is undecidable. Typability
in MLo is Turing reducible to the problem of determining if a set of constraints
is satisfiable with respect to a given set of assumptions. If assumption sets are
restricted to parametric overloadings, the problem of constraint set satisfiability is
NP-complete.

The type inference algorithm Wo with parametric overloading has been
implemented utilizing the formalism of attribute grammars with GrammaTech's
Synthesizer Generator. It performs type inference on expressions in an interactive
environment. Type inference is performed incrementally so that the types of partial
expressions can be inferred and efficiency of re-computation in the presence of
updates is enhanced. Consequently, immediate feedback is provided to the user as
expressions are entered and updated.

Our implementation will be used to examine the practical bounds on the
problem of constraint-set satisfiability. It will also represent a significant tool for
exploring the limits of bounded polymorphism, or overloading, in programming languages.
Can we devise new forms of overloading which are more flexible than parametric
overloading yet retain a decidable satisfiability problem?

A. FUTURE WORK


This thesis will serve as a basis for further research aimed at ultimately
developing a type discipline for a class of implicitly-typed imperative programming
languages with subtypes, overloading and polymorphism. A more immediate goal
is to merge our implementation of Wo with an SSL implementation that performs
on-line type inference utilizing the type inference algorithm W of ML.

On-line type inference allows the introduction of new global definitions as a
program is produced. This differs from our batch implementation, where the
assumptions about the types of free identifiers available to each expression are
given in the form of an assumption set. The incorporation of overloading in an
on-line implementation will be the subject of the next step in this research effort.
This will produce an interactive environment where global definitions, perhaps
overloaded, may be introduced at any point in the program. Types of all dependent
terms are then recomputed as a result of these new definitions.